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Abstract. This is the report of a project with the aim to make a new im- 
plementation of the Schreier-Sims algorithm in GAP, specialized for matrix 
groups. The standard Schreier-Sims algorithm is described in some detail, 
followed by descriptions of the probabilistic Schreier-Sims algorithm and the 
Schreicr-Todd-Coxeter-Sims algorithm. Then we discuss our implementation 
and some optimisations, and finally we report on the performance of our im- 
plementation, as compared to the existing implementation in GAP, and we give 
benchmark results. The conclusion is that our implementation in some cases 
is faster and consumes much less memory. 
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CHAPTER 1 



Introduction 

The following report makes up one the two parts of a project in computational 
group theory, the other part being a software package for the computer system 
GAP (see GAP ), which can be found on the WWW at the URL given on the title 
page. This text will describe the mathematics that provide the foundations of the 
package, including the algorithms used and their complexity, and also some of the 
more computer science oriented aspects, like what datastructures that were used, 
and how the implementation was done. 

Computational group theory (CGT) is an area of research on the border between 
group theory and computer science, and work in CGT is often of both theoretical 
(mathematical) and practical (programming) nature, leading to both theoretical 
results (mathematical theorems and proofs) and practical results (software), and 
this project is no exception. Introductory surveys of CGT can be found in Sim03 , 
Ser97j, |JNeu95| and |CH92| . 

The aim of the project was to make a GAP package with an implementation of 
the Schreier-Sims algorithm for matrix groups. The Schreier-Sims algorithm com- 
putes a base and a strong generating set for a group, and an implementation of 
this fundamental algorithm is already included in the standard GAP distribution, 
but that implementation always first computes a faithful action (ie a permutation 
representation) of the given group and then executes the algorithm on the resulting 
permutation group. The idea for this project was to restrict attention to matrix 
groups, and implement a version of the algorithm which works with the matrices 
directly, and see if one can obtain a more efficient implementation in this way. 

A survey of computational matrix group theory can be found in |NP01j . It 
should be noted that we are only interested in finite groups, ie matrix groups over 
finite fields, and therefore there is no need to worry about any noncomputability 
or undecidability issues. 

We will begin with a quick reference of the basic concepts from group theory and 
computer science that are being used, before moving on to describe the Schreier- 
Sims algorithm. The description will be quite detailed, and then we will turn to 
the variants of the algorithm that have also been implemented in the project: the 
random (ie. probabilistic) Schreier-Sims algorithm and the Schreier-Todd-Coxeter- 
Sims algorithm. After that we will say something about the implementation, and 
describe some tricks and improvements that have been done to make the algorithm 
faster. Finally, the practical performance and benchmark results of the implement- 
ation will be shown, and compared to the existing implementation in GAP. 

It must be mentioned that a report similar to this one is Mur93 , from which 
a fair amount of inspiration comes. 
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CHAPTER 2 



Preliminaries 

The definitions and statements in this section are assumed to be known, but we 
state them anyway, since authors often use different notation and sometimes put 
slightly different meaning to some of the following concepts (eg. the exact definition 
of graphs tend to vary). 

1. Graph theory 

First some graph theory, where we follow [Ejg89 . 

Definition 2.1. A directed graph is an ordered pair G = (V,E) where V is a 
finite non-empty set, the vertices of G and E C V x V is the edges of G. 

Remark 2.2. A graph in the sense of 12. H is sometimes called a combinatorial 
graph in the literature, to emphasize that they are not metric graphs in the sense 
of BH99 . As we are interested only in finite graphs and not in geometry, we do 
not make use of this nomenclature. 

Remark 2.3. The definition implies that our graphs have no multiple edges, 
but may have loops. We will henceforth omit the word "directed" since these are 
the only graphs we are interested in. 

Definition 2.4. If G = (V, E) is a graph, a sequence v = v\,v%, . . . ,v n = u of 
vertices of G such that (vi, w^+i) G E for i = 1, . . . ,n — lis called a walk from v to 
u. The length of a walk v — v\, . . . ,Vk — u is k — 1. If v = u then the walk is called 
a cycle. If all vertices in the walk are distinct, it is called a path. If every pair of 
distinct vertices in G can be joined by a path, then G is connected. 

Definition 2.5. A graph G' — (V, E') is a subgraph of a graph G = (V, E) if 
V CV and E' C E. 

Definition 2.6. A tree is a connected graph without cycles. A rooted tree is a 
tree T = (V, E) where a vertex r £ V have been specified as the root. 

Definition 2.7. If G = (V, E) is a graph, then a tree T is a spanning tree of 
G if T is a subgraph of G and T have vertex set V. 

The following are elementary and the proofs are omitted. 

Proposition 2.8. In a tree T = (V,E) we have \E\ = \V\ - 1 and there exists 
a unique path between every pair of distinct vertices. Conversely, if G is a graph 
where every pair of distinct vertices can be joined by a unique path, then G is a 
tree. 

Proposition 2.9. Every graph has a spanning tree. 
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2. PRELIMINARIES 



Definition 2.10. Let T = (V, E) be a rooted tree with root r G V. The depth 
of a node x G is the length of the path from x to r. The height of T is the 
maximum depth of any node. 

Definition 2.11. If G = (V,E) is a graph, then a labelling of G is a function 
w : E L where L is some set of " labels" . A labelled graph is a graph with a 
corresponding labelling. 

2. Group theory 

Now some group theory, where we follow our standard references |BB96| and 
|Ros94|. 

Note 2.12. All groups in this report are assumed to be finite. 

Definition 2.13. If G = (S) is a group, then the Cayley graph Cg{S) is the 
graph with vertex set G and edges E = {(g, sg) \ g € G,s € S}. 

Definition 2.14. An action of a group G on a finite set A is a homomorphism 
A : G — > Sym(X) (where Sym(A) is the group of permutations on X). If A is 
injective, the action is faithful. 

Remark 2.15. Following a convention in computational group theory, actions 
are from the right, and X(g)x is abbreviated with x 9 , for g € G and a; £ A. The 
elements of A are called points. Note that some of the rules for exponents hold 
since we have an action, eg (x 9 ) h = x gh . 

Definition 2.16. Let G be a group acting on the finite set A. For each point 
a G X, the orbit of a is a G = {/3GA|/3 = a! 9 ,gG G} and the stabiliser of a is 
G a = {g e G \ a 9 = a}. 

The following are elementary and the proofs are omitted. 

Proposition 2.17. Let G be a group acting on the finite set A. For eachp G X 
we have G p < G, and so we can define inductively 

where n > 1 and ai,...,a„ 6 I. 

Proposition 2.18. Lei G be a group acting on the finite set X. For eachp G A, 
the map fi p : G/G p — > p given by 

(2.2) G p5 ^ p 9 

/or eac/i g ^ G, is a bisection. In particular, |p G | = [G : G p ]. 

3. Computer science 

When it conies to computer science, our standard reference is |CLR90] where 
all basic computer science notions that are used here can be found. First of all, 
complexity analysis of algorithms and its asymptotic notation, in particular the 
0(-)-notation, is assumed to be known. Basic graph algorithms like breadth-first 
search, computation of connected compontents and spanning tree algorithms are 
also assumed to be known. 

Hash tables will likewise be used without further explanation. Even though the 
code in the project does not use hashing explicitly, but rely on GAP for that, it is 
worth mentioning that, to the author's knowledge, the best general-purpose hash 
function known to humanity is described in |Jen97| . 
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The Schreier-Sims algorithm 

We will now describe the Schreier-Sims algorithm, and the references we are us- 
ing are |CSS99| . |Ser03j and |But91j . First we have to describe some background 
and clarify the problem that the algorithm solves. 

1. Background and motivation 

The overall goal for work in group theory is to understand groups, and to answer 
various questions about groups. In particular, in CGT the interest is focused on 
questions of algorithmic nature, so given a group G we are interested in algorithms 
for things like the following: 

• What is |G|? 

• List all elements of G, without repetition. 

• If G < H and we are given an arbitrary g G H , is it true that g G G? This 
is referred to as the membership problem. 

• Generate a random g G G. 

• Is G abelian (soluble, polycyclic, nilpotent)? 

• Find representatives for the conjugacy classes of G. 

• Find a composition series for G, including the (isomorphism classes of the) 
composition factors. 

• Given g € G or H < G find the centralizer of g or the normalizer of H , 
respectively. 

To accomplish these tasks we need a computer representation of G, ie a datastruc- 
ture. A common way in which a group is given is via a generating set, but this 
alone does not help in solving our problems, so we need a better representation. It 
must be possible to compute this representation from a generating set, and a nice 
property would be if subgroups of G in some direct way inherit the representation, 
so that divide- and- conquer techniques could be used when designing algorithms 
(see |CLR90p . 

2. Base and strong generating set 

Consider the situation where we have a chain of subgroups of G 
(2.1) G = G° > G 1 > • ■ ■ > G" = 1 

Each g G G can be written g = g\U\ where u\ is a representative of G 1 g and g\ G G 1 , 
and inductively we can factorize g as g — u„it„_i ■ ■ ■ Ui, since the subgroup chain 
reaches 1. Moreover, this factorization is unique since for each i the cosets of G l+1 
partition G l , and by the same reason, different group elements will have different 
factorizations. 
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3. THE SCHREIER-SIMS ALGORITHM 



We thus see that if we know generating sets for the subgroups in such a subgroup 
chain, and if we know a right transversal T % of the cosets of G 1+1 in G 1 for each i = 
0, . . . , n— 1, then we could easily solve at least the first two listed problems listed in 
section^ By Lagrange we know |G| = jT 1 ) | C 1 1 and inductively |G| = jT 1 ) • • ■ |T"|, 
which solves the first problem. Using the factorization we have a bijection from G 
to T 1 x T 2 x ■ ■ • x T n so by enumerating elements of the latter set and multiplying 
in G we can list the elements of G without repetition. 

Now we introduce a special type of subgroup chain. 

Definition 3.1. Let G be a group acting on the finite set A. A sequence of 
points (ai, . . . , a n ) of A such that G ait . <*„ = 1 is called a base for G. Note that 
the base determines a stabiliser chain 

(2.2) G > G ai > • •• > Gai,...,a„ — 1 

Let G 1 — G Q1 ,...,o! i for i = 1, ... ,ri. A generating set S for G such that (5 n G 1 ) = 
G 1 for all i is called a strong generating set (SGS) for G. 

The concept of a base and strong generating was first introduced in |Sim70| 
in the context of permutation groups, and is of fundamental importance, though 
mainly for permutation groups. We already know that the first two of our problems 
can be solved if we know a subgroup chain and we shall see that with a base 
and strong generating set the membership problem can also easily be solved. The 
Schreier-Sims algorithm computes a base and strong generating set for a group G 
given a generating set S, and since it is an efficient algorithm if G is a permutation 
group, the concept of base and strong generating set has become very important. 
Many more sophisticated algorithms for permutation groups require a base and 
strong generating set as input. For matrix groups, on the other hand, the situation 
is a bit more complicated, as we shall see later. 

3. Schreier trees 

Before giving the Schreier-Sims algorithm itself, there are a few auxiliary al- 
gorithms that must be explained. Consider therefore a group G = (S) that acts 
on a finite set X. From the previous section we know that even if we have a base 
(ai, . . . ,a n ) for G, with corresponding stabiliser chain G > G 1 > • • ■ G n — 1, we 
also need to find right transversals of the cosets of G i+1 in G l for i = 1, . . . , n — 1. 
But since these groups are stabilisers from the action on A, we can use Proposition 
12. 181 and instead find the orbits af, , ■ ■ • , ■ We will see that the orbits are 

straightforward to compute. 

The action of G on A can be represented by a labelled graph with labels from 
S, analogous to the Cayley graph Cs(G) of G from Definition ^. 131 Let the vertices 
of the graph be A and the edges be {(p,p 9 ) | p £ X,g £ G}, where the edge (p,p s ) 
is labelled by g. Obviously the orbits of the action are the connected components of 
this graph, so af is the component containing u\. We are interested in finding this 
orbit and to store it in a datatstructure, and we are thus led to consider a spanning 
tree of the connected component, since such a tree contains enough information for 
us. The edges of the graph that are left out do not give us any additional relevant 
information about the action of G, only alternative ways to move between the 
points, and trees are considerably easier to store than graphs. 
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Definition 3.2. Let G be a group acting on the finite set X and let ael.A 
spanning tree rooted at a for the component in the corresponding graph containing 
a is called a Schreier tree for the orbit a G . 

The Schreier tree can be computed by a simple breadth-first search of the 
component containing a, and thus we have an algorithm for finding the orbits. 
However, it is of course not a computationally good idea to explictly generate the 
graph, then compute the connected components and finally find the Schreier trees. 
A breadth-first search to find the Schreier tree can be done without the graph itself, 
as demonstrated by | Algorithm 3.1] 

Algorithm 3.1: ComputeSchreierTree 
Data: A group G = (S) acting on a finite set X and a point a G X. 
Result: A Schreier tree for a G . 

/* Assumes the existence of a function Tree(x) that creates an empty tree 
with root x and a function AddChild(T, pi, p2, I) that adds p2 as a child 
to pi in the tree T, with label I. */ 
l begin 



2 points := {a} 

3 tree := Tree(ct) 

4 repeat 

5 children := 

6 for each p G points do 

7 foreach s G S do 

8 p' := p s 

9 if p 1 £ tree then 

10 AddChild(tree,p,p', s) 

n children := children U {p 1 } 

12 end 

13 end 

14 end 

15 points := children 

16 until points = 

17 return tree 



is end 



We noted above that we used Proposition ^ . 1 81 to store the orbits instead of the 
coset representatives, but we still need the latter, so we must be able to compute 
them. Fortunately, that is straightforward to do when we have a Schreier tree. 
Assume we have a Schreier tree T = (V, E) for the orbit a G for some point a G X . 
If g G G then a 9 G V and there is a path from a to a 9 in T. Let si, S2, ■ ■ ■ , s n be 
the labels of the path, so that Sj G S for i = 1, . . . , n, and let h = S1S2 • ■ ■ s n . Then 
obviously a h = a 9 so G a g — G a h and h is a coset representative for the coset of g. 
Moreover, h is unique since T is a tree. Thus, to find the coset representative for 
g we only have to follow the unique path from a 9 to the root a and multiply the 
edge labels. [Algorithm 3. 2| performs the slightly more general task of following the 
path from a given point to the root. 
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Algorithm 3.2: OrbitElement 





Data: A group G = (S) acting on a finite set A, a Schreier tree T for the 




orbit a G of the point a. £ X and an arbitrary point p G X . 




Result: The element g G G such that a 9 = p 




/*Assumes the existence of a function EdgeLabel(T, p) that returns the 




label of the unique edge between p and its parent in T * / 


1 


begin 


2 


<?:=! 


3 


while p 7^ a do 


4 


s := EdgeLabel(T,p) 


5 


p :=p s 


6 


9 ■= sg 


7 


end 


8 


return g 


9 


end 



We defer the complexity analysis of these algorithms until later. 

4. Formulating the problem 

To more formally state the problem solved by the Schreier-Sims algorithm, the 
following is needed. 

Definition 3.3. Let G be a group acting on the finite set X. A sequence of 
points B — (ai, . . . , a n ) of X and a generating set S for G, such that no element 
of S fixes all points of B, is called a partial base and partial strong generating set, 
respectively. 

Remark 3.4. A base and strong generating set as in Definition 13 . 21 are called 
complete. 

Remark 3.5. If we define G l = G au ... iOH , S l = S n G l and W = (S l ) for 
i = 1, . . . , n, we see that H n = 1 since no element of S fixes all points of B (and 
we use the convention that (0) = 1). We therefore have 

(4.1) G>G 1 >--->G n 

(4.2) G > H 1 > ■ ■ • > H n = 1 

Moreover, G l > = H l for i — l,...,n and if we have equality then, by 

Definition E3 S and B are complete. If h G H l+1 = (S' l+1 ) = (SnG l+1 ) then 
h = si ■ • • Sfc where c*2_i = cti+i for j = 1, . . . , k so a^ +1 = and therefore 

/i G H l a . +1 . Thus < H l a . +1 for i = 0, . . . , n - 1. 

Now our problem can be stated as follows: given a group G acting on the 
finite set X, together with a partial base B with points from X and partial strong 
generating set S, either verify that B is a (complete) base and that S is a (complete) 
strong generating set, or extend B and S so that they become complete. This is 
the problem that is solved by the Schreier-Sims algorithm. 

The following result from |Leo80| is used when designing the algorithm. 
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Theorem 3.6. Let G be a group acting on the finite set X, and let B = 
{at, . . . , a n ) be a partial base and S a partial strong generating set for G. Let also 
G i = G au ... iai> S* = Sr\G l , H* = (S l ) fori = l,...,n and G = G° = H°. Then 
the following statements are equivalent: 

(1) B and S are complete. 

(2) G l =H t fori = 0,...,n. 

(3) W ai+l =H i+1 fori = 0,...,n-l. 



(4) [H l : H i+1 ] 



a F+t 



for i = 0, . . . ,n — 1. 



Proof. From Remark 13.51 we know that Q and © are equivalent. Assuming 
(0) we have 

(4.3) W ai+1 = Gi i+1 = G* +1 = 

for i = 0, . . . , n — 1 which is precisely ©. If we instead assume Q and also assume 
for induction that G % = H l (the base case G = H° = G° is ok) then 

(4.4) G l+1 = Gi i+1 = Hi i+t = H l+l 

so by induction we get G l = if* for i = 0, . . . , n, which is J2J- 

Now assume © and note that from 12.151 we have [H 1 : ] — Oif +1 , so 

since = H L+1 we get |@}. Finally, assume ©. From Remark 13.51 we know 

H^ i+1 > H l+1 so if we again use l2~THl we get a?^ = [W : H i+1 ] > [H l : H l a%+1 ] = 

tfi x and thus £T* <+1 = H l+1 . □ 

As observed earlier, we are often given a group in the form of a generating 
set, but Schreier-Sims algorithm requires a partial base and a partial strong gener- 
ating set as input. Those are easy to compute, though, using |Algorithm 3~3] The 
algorithm also makes sure that the partial strong generating set is closed under in- 
verses and does not contain the identity, which removes the need to consider some 
special cases later on. We will see how the function NewBasePoint that is used in 
[Algorithm 3.3| can be implemented when G is a matrix group. 

5. Schreier's Lemma 

The name Schreier in Schreier-Sims algorithm comes from the following result, 
which in our case allows us to find a generating set for a stabiliser. It first appeared 
in |Sch27| . and our proof is originally from |Hal59j . 

Theorem 3.7 (Schreier's Lemma). Let G = (S) be a group, let H < G and let 
T be a right transversal of the cosets of H in G. For g £ G, let g € T be the unique 
element such that Hg = Hg. Then H is generated by 

(5.1) s H = {tsiuy 1 \ teT,seS} 

Proof. Without loss of generality we can assume that 1 G T (the coset rep- 
resentative of H itself). By definition, Hts = Hts which implies that ts(ts) -1 G H 
for alH G T, s G S. Hence, Sh Q H and (Sh) < H, so the content of the statement 
lies in the other inclusion. 

Let h G H < G and observe that since (S) = G we have h = S1S2 ■ ■ ■ Sfc where 
Si G S U S^ 1 for i = 1, . . . , k. Define a sequence t\, ti, ■ ■ ■ , tk+i of k + 1 elements of 
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Algorithm 3.3: GetPartialBSGS 

Data: A group G — (S) acting on a finite set X and a sequence of points B 

of X (possibly empty). 
Result: A partial base B' and partial strong generating set S' for G. 
/^Assumes the existence of a function NewBasePoint(g) that returns a 
point p G X such that p 9 ^ p */ 
l begin 



2 base := B 

3 sgs := 

4 foreach s € S \ { 1 } do 

5 if base s = base then 

6 point := NewBasePoint(s) 

7 base := base U {point} 

8 end 

9 sgs := sgs U {s, s -1 } 

10 end 

11 return (base, sgs) 



12 end 



T as follows: t\ = 1 and inductively tj+i = tiSi. Furthermore, let en = tiSit i+ i for 
i = 1 , . . . , n and observe that 

(5.2) h = {t 1 s 1 t2 1 )(t 2 s 2 t^ 1 ) ■ ■ ■ (t n s n t~\ x )t n+ i = aia 2 • ■ ■ a n t n+1 

We now show that a; € (<Si?) for i = 1, . . . , n, and that t n +\ — 1, which implies 
that H < (S H )- 

For each i = 1, . . . , n, either Si £ S or s^ 1 G 5*. In the first case we immediately 
get a, = M^Mi) -1 G Sh, and in the second case we have Hti+is^ 1 — HtiSiS^ 1 — 
HtiSiS^ 1 — HU which implies that U = ii+is^ 1 . Hence, a^ 1 = ti+is^ 1 ^ 1 = 

U+is^ 1 (ti+is^ 1 } G Sh and thus a, G (5^)- 

Finally, since h G H and (Sh) < H we have = (aia2 • • • a n )~ l h G i? , so 
t n+ i is the coset representative of H, and therefore t n +i = 1. Thus < (5^). □ 

Our situtation is that we have a group G = (S 1 ) acting on the finite set X, and 
we want to use Schreier's Lemma to find the generators (usually called the Schreier 
generators) for the stabiliser G a , where a G X. If we compute a Schreier tree for 
the orbit a G using [Algorithm 3.1| then we know that [Algorithm 3~2] can be used to 
find the transversal of the cosets of G a in G. 

For p G X, let t(p) G G denote the result of [Algorithm 3.2j Using the notation 
in Theorem 13.71 we then have g — t(a 9 ) and the transversal is {t(p) \ p G a G }, so 
for s G S and p G a the Schreier generator can be expressed as 

(5.3) t(p)st(a* (p)s )- 1 = i(p)st((a t ^) s )- 1 = <(p)si(p s )- 1 
and a generating set for G a is 

(5.4) {t^st^y 1 \pea G ,s E S} 
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5.1. Computing a base and SGS. Going back to our problem, if we have 
a partial base B — (ai, . . . , a n ) for G with corresponding partial strong generating 
set S, then we can use Schreier's Lemma to solve our problem. Using the notation 
from Theorem l3.6l we calculate the Schreier generators for each G l , using (|5.4|l and 
add them to S, possibly adding points to B, if some Schreier generator fixes the 
whole base, in order to ensure that B and S are still partial. When this is finished, 
we have H l = G % for i — 1, . . . , n and thus B and S are complete. We can use 
[Algorithm 3.4| to calculate a Schreier generator. 



Algorithm 3.4: GetSchreierGenerator 
Data: A group G = (S) acting on a finite set X, a Schreier tree T for the 

orbit oF of the point a G X , a p G X and a generator s G S. 
Result: The Schreier generator corresponding to p and s 

1 begin 

2 ti := OrbitElement(T, p) 

3 t 2 := DrbitElement(T,p ;i ) 

4 return t\st2 

5 end 



However, there is a problem with this simple approach, in that the generating 
sets defined in (|5.4|l can be very large, and contain many redundant generators. 
A fraction of the Schreier generators is usually enough to generate the stabiliser. 
In |Hal59| it is shown that [G : G a ] — 1 of the Schreier generators for G a are 
equal to the identity. For instance, if for some point p G a G we have that (p,p s ) 
is an edge of the Schreier tree for a G , then the Schreier generator t(p)st(p s )^ 1 is 
the identity. Thus, the number of non-trivial Schreier generators can be as large as 
(l^l — 1)[G : G a ] + 1, though some of them may be equal to each other. 

In our case we calculate the Schreier generators for each G l , using S 1 ^ 1 in place 
of S, and since S 1 " 1 are precisely the calculated Schreier generators for G i_1 , the 
number of non-trivial Schreier generators for G l may be as large as 

(5.5) i+asi-ijnM+i 

3=0 

Since the orbit sizes are only bounded by \X\, we see that (|5.5|l is exponential in 
\X\, and therefore this method may not be efficient. 

5.2. Reducing the number of generators. It is possible to reduce the 
number of generators at each step, so that our generating set never grows too large. 
This can be done using [Algorithm 3.5| which is due to Sims, and which can also be 
found in |Sim03| and CSS99 . 

This algorithm reduces the generating set to size (If 1 ) G Q(|A"| 2 ). Wit h this 
algorithm, we can solve our problem using [Algorithm 3.6| and [Algorithm 3.7| This, 
however, is not the Schreier-Sims algorithm, which is a more clever and efficient 
method of performing the same things. 
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Algorithm 3.5: BoilSchreierGenerators 

Data: A group G acting on a finite set X, a partial base B = (oei, . . . , au) 
and corresponding partial strong generating set S for G, an integer 
1 < m < k. 

Result: A smaller partial strong generating set for G 
l begin 



2 for i := 1 to m do 

3 T := S 1 - 1 

4 foreach g 6 T do 

5 foreach h £ T do 

6 if af — ctl ^ oti then 

r S:=(S\{h})U{ghr 1 } 

8 end 

9 end 
10 end 

n end 

12 return S 



13 end 



Algorithm 3.6: ComputeBSGS 

Data: A group G — (S) acting on a finite set X. 

Result: A base and strong generating set for G 
l . begin 



2 (base, sgs) := GetPartialBSGS(5, 0) 

3 for i := 1 to \base\ do 

4 (base, sgs) := Schreier(base, sgs, i) 

5 sgs := BoilSchreierGenerators(base, sgs, i) 

6 end 

7 return (base, sgs) 



8 end 



6. Membership testing 

We now assume that we know a (complete) base B = (ai, . . . , a„) and a (com- 
plete) strong generating set S for our group G, and we present an efficient algorithm 
for determining if, given an arbitrary group element g, it is true that g £ G. The 
implicit assumption is of course that G < H for some large group H and that 
g G H, but since we are interested in matrix groups this is always true with H 
being some general linear group. 

This algorithm is used in the Schreier-Sims algorithm together with Theorem 
13.61 as we shall see later. 

Recall that if we have a base then there is an associated stabiliser chain, and as 
described in section [21 if g 6 G we can factorise g as a product of coset represent- 
atives g = u n u n -i ■ ■ -ui where Ui is the representative of G l g in G 1 " 1 . Moreover, 
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Algorithm 3.7: Schreier 

Data: A group G acting on a finite set X, a partial base B = (oei, . . . , au) 
and corresponding partial strong generating set S for G, an integer 
1 < i < k such that G 1 = H j for j = 0, . . . , i - 1. 
Result: Possibly extended partial base B = (ai, . . . , a m ) and 

corresponding partial strong generating S set for G such that 
G 3 =W for j = 0,...,i. 
/*Assumes the existence of a function NewBasePoint(g) that returns a 
point p6l such that p 9 ^ p * / 

l begin 



2 T := S"- 1 

3 tree := ComputeSchreierTree(T, a^) 

4 foreach p 6 do 

5 foreach s E T do 

6 gen := GetSchreierGenerator(tree,p, s) 

7 if gen 7^ 1 then 

8 S := 5U {gen^en^ 1 } 

9 if B« en = B then 

10 point := NewBasePoint(gen) 

n B:=Bl) {point} 

12 end 

13 end 

14 end 

15 end 

16 return (B, S) 



17 end 



if we for each i = 1, . . . , n have computed a Schreier tree T, for af then we can 
use | Algorithm 3.2] to compute each coset representative. 

More specifically, if g £ G then g = git(af) where, as before, t(p) is the output 
of |Algorithm 3.2| on the point p and g\ 6 G 1 . On the other hand, if g ^ G then 
either af ^ a G or g\ £ G\. To test if g e G we can therefore proceed inductively, 
and first check if a\ e a G and if that is true then test whether g\ 6 G\, This is 
formalised in [Algorithm 3.8| 

The terminology residue and level is introduced here, with obvious meanings. 
As can be seen, the algorithm returns the level at which it fails, as this is needed 
in the Schreier-Sims algorithm. Note that even if all n levels are passed, it might 
happen that the residue r ^ 1 at line !3.8l which also indicates that g ^ G. 

In the literature, [Algorithm 3.8] is usually referred to as sifting or stripping of 
the group element g. 

7. The main algorithm 

Finally, we can now present the Schreier-Sims algorithm itself. It uses a more ef- 
ficient method of reducing the number of Schreier generators considered, by making 
use of |Algorithm 3.8| 
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Algorithm 3.8: Membership 




Data: A group G acting on a finite set X, a base B — (pt-y, 


. . . , ct n ), a 




Schreier tree Ti for the orbit (Xj^y for each i = 0, . . . 


n — 1 , and a 




group element g. 






Result: A residue r and drop-out level 1 < I < n + 1. 




1 


begin 




2 


r := ,g 




3 


for i := 1 to n do 




4 


if o£ ^ T,_i then 




5 


return (r, i) 




6 


end 




7 


element := OrbitElement(Tj_i, a\) 




8 


r := r ■ element -1 




9 


end 




10 


return (r, n + 1) 




11 


end 





Using the notation from Theorem 13. 61 we want to show that H l a — H l+1 for 
each i, or equivalently that all Schreier generators for H l are in H t+1 . If we proceed 
from i = n— 1, . . . , instead of the other way, then for H n = 1 we obviously already 
have a base and strong generating set, so we can use | Algorithm 3.8| to check if the 
Schreier generators for H™~ 1 are in H n . 

When we have checked all Schreier generators for H^ 1 we then have a base 
and strong generating set for i?™ -1 by Theorem 13.61 since H™^ 1 = H n , and we 
can therefore proceed inductively downwards. This is shown in [Algorithm "3~9]and 
|Algorithm XlO| 



Algorithm 3.9: ComputeBSGS 

Data: A group G = (S) acting on a finite set X. 
Result: A base and strong generating set for G 
l . begin 



2 (base, sgs) := GetPartialBSGS(5, 0) 

3 for i := \base\ to 1 do 

4 (base, sgs) := SchreierSims(base, sgs, i) 

5 end 

6 return (base, sgs) 



7 end 



7.1. Matrix groups. As can be seen in the given algorithms, the main part 
of the Schreier-Sims algorithm is indepedent of the particular type of group, but we 
are interested in the situation where G < GL(d, q) for some d > 1 and some q = p r 
where p is a prime number and r > 1. Here, d is called the degree of G and is the 
number of rows (columns) of the matrices in G, and q is the finite field size, ie G 
contains matrices over GF(q) = ¥ q . In this setting, G acts faithfully from the right 
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Algorithm 3.10: SchreierSims 

Data: A group G acting on a finite set X, a partial base B = {a.\, . . . , au) 
and corresponding partial strong generating set S for G, an integer 
1 < i < k such that H^J 1 = £F for j = i + 1, . . . , k, and Schreier 

trees T- 7-1 for a^ 3 for j = i + 1, . . . , k. 
Result: Possibly extended partial base B — (a%, . . . , a m ) and corresponding 
partial strong generating S set for G such that H^J 1 = for 

j = i, . . . , m and Schreier trees T^ 1 for a^ J for j — i, . . . ,m. 
/*Assumes the existence of a function NewBasePoint(g) that returns a 
point p£l such that p 9 ^ p */ 
l begin 



2 gens := S' 1 

3 T % ~ 1 := ComputeSchreierTree(gens, «i) 

4 foreach p G af 1 do 

5 foreach s 6 gens do 

6 gen := GetSchreierGenerator(T 4_1 ,p, s) 

7 if gen ^ 1 then 

8 (residue, dropout) := Member ship(T l , . . . , T k , gen) 

9 if residue ^ 1 then 

10 S := 5U {ge^gen- 1 } 

n if dropout = k + 1 then 

12 point := NewBasePoint(gen) 

13 B:=BU {point} 

14 end 

15 for j := r to i + 1 do 

16 (5, 5) := SchreierSims(i?, 5, j) 

17 end 
is end 

19 end 

20 end 

21 end 

22 return (B, S) 



23 end 



on the vector space X = FJj, by multiplication of a row vector with a matrix. We 
denote the standard base in F^ by ei, e 2 , . . . , e<j. 

What remains to specify is the algorithm NewBasePoint, which depends on the 
particular point set X and the action that is used. To construct this algorithm, we 
observe that given a matrix M ^ I G G, if Mjj ^ for some i ^ j then ejM ^ e, 
so the row vector e, is a point moved by M. If M instead is diagonal, but not a 
scalar matrix, then Ma ^ Mjj for some i ^ j, and (ej + &j)M =/= e^ + ej so this 
row vector is moved by M. Finally if M is scalar, then M\i ^ 1 since M ^ I and 
thus eiM 7^ ei. This translates directly into | Algorithm 3.11| 
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Algorithm 3.11: NewBasePoint 

Data: A matrix M I € G < GL(d, q). 
Result: A row vector v £ such that vM =^ v 
l . begin 



2 for i := 1 to d do 

3 for j := 1 to d do 

4 if i 7^ j and Mij ^ then 

5 return e, 

6 end 

7 end 

8 end 

9 for i := 1 to d do 

10 for j := 1 to d do 

n if i ^ j and Ma ^ Mjj then 

12 return e, + ej 

13 end 

14 end 

15 end 

16 return e x 



17 end 



8. Complexity analysis 

We now analyse the time complexity of some of the given algorithms, but we 
will not be interested in space complexity. For the analysis we assume that the 
external functions Tree, AddChild and EdgeLabel, that depend on the particular 
datastructure used, take 0(1) time. This is a reasonable assumption and it is sat- 
isfied in the code for the project, which uses hash tables to implement the Schreier 
trees. We also assume that the datastructure used for sets is a sorted list, so that 
elements can be found, added and removed in logarithmic time using binary search. 
This assumption is satisfied in GAP. 

Moreover, since we are working in a matrix group G = {S} < GL(c?, q) that acts 
on the vector space X = F^, we know that the multiplication of two group elements 
takes Q(d 3 ) time and that the action of a group element on a point takes 9(d 2 ) 
time, under the assumption that the multiplication of two elements from F g takes 
0(1) time. This assumption is reasonable since we are mostly interested in the case 
when q is not too large, so that one field element can be stored in a machine word 
and manipulated in constant time. Then we also see that testing equality between 
two group elements takes 0(d 2 ) time and testing equality between two points takes 
0(d) time. 

Consider first [Algorithm 3TT| We know that it is a breadth-first search and we 
know that a simple breadth-first search on a graph Q — (V, E) takes 0(|V| + \E\) 
time. In our case we have \E\ = \V\ \S\ — \X\ \S\, so the edges are dominating. We 
also see that line 18.11 takes 0(c? 2 ) time and line 13.11 takes 0(log |5|) time since the 
size of the set children is bounded by \S\. The former line is therefore dominating, 
and thus we have that | Algorithm 3. H takes 0(|X| \S\ d 2 ) time. 
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For | Algorithm 3.2| we just note that in a tree with vertex set V, the depth of 
any node is bounded by \ V\. In our case the vertices of the Schreier tree is the points 
of the orbit, which may be the whole of X if the action is transitive. Therefore we 
see that [Algorithm 3~2] takes time Q(| X| d 4 ). 

It is obvious that [Algorithm 3.11| takes time 0(d 3 ), and it is also obvious that 
[Algorithm "3~4| takes time 0(|X| d 2 + d 3 ) = 0(\X\d 2 ). In [Algorithm 3.3| we see 
that line 13.31 takes time 0(1) because the base is just a list and can therefore 
be augmented in constant time. We see that line 13.31 takes time 0(\B\d 3 ) since 
we might have to multiply every base point with the group element and check 
equality, in the case where it actually fixes the base. Thus, [Algorithm 3.3| takes 
time 0(| S|log|S| \B\d 3 ) 

Since [Algorithm 3.7| is not used in the GAP package of the project, we skip 
the complexity analysis of it and its related algorithms. We also skip any detailed 
complexity of the Schreier-Sims algorithm itself, since we will see that for matrix 
groups, this is futile anyway. The case of permutation groups is analysed in But91| 
and the result is that the Schreier-Sims algorithm has time complexity 0(|X | ). 

The essential problem when using the algorithm for matrix groups is then that 
X = so \X\ = q d and the Schreier-S ims algorithm is thus exponential in d. 
This is quite disheartening, since it is the case when d grows large that we are 
interested in, more often than when q is large. However, it is possible to make the 
implementation fast enough for many practical purposes, so all is not lost. 



CHAPTER 4 



The probabilistic Schreier-Sims algorithm 



It is well-known in computer science that probabilistic algorithms often turn out 
to be much simpler in structure than the corresponding deterministic algorithms 
for the same problem, while still producing good solutions. That the algoritms 
are simpler usually means that they can be made more efficient and are easier to 
implement. 

On the other hand, the drawback with the probabilistic approach is that there 
is a non-zero probability that the algorithms in some sense can "lie", ie return in- 
correct solutions. An introduction to the subject and the various complexity classes 
can be found in |Pap94 . 

The probabilistic Schreier-Sims algorithm is a good example of this situation. 
It was first described in |Leo80| , and the idea comes from the following 

Theorem 4.1. Let G be a group acting on the finite set X, and let B = 
(«!,..., a*;) be a partial base and S a partial strong generating set for G. For 
i = l,...,n, let G l = G au ... tat , 5* = 5nG*, H* = (S l ), let G = G° = H° and let 
T l be a right transversal for the cosets of H l ~ x in H 1 ^ 1 . Then |n"=i^| divides 
\G\. 

Proof. Since \T l \ = [iT" 1 : H'ir 1 } we have 



(0.1) 




\i=l 

and thus the theorem follows. □ 

Corollary 4.2. Let G be a group acting on the finite set X , and let B = 
(at, ■ ■ • , ctfe) be a partial base and S a partial strong generating set for G. If B and 
S are not complete and g 6 G is a uniformly random element, then the probability 
that \ Algorithm 3~^ returns residue r ^ 1 when given g is at least 1/2. 

Proof. We see that even if B and S are not complete, we can compute Schreier 
trees for the orbits a^ +1 for i = 0, . . . , n — 1 and feed them to [Algorithm 3.8| which 
then tries to express g in terms of the corresponding right transversals T l for : 

Thus, [Algorithm 3. 8| checks if g e Ua =1 T l and by Theorem Ol if nLi T * is 
not the whole G, then it contains at most half of the elements of G. Therefore, since 
g is uniformly random, we have Pr[g ^ n"=i ^1/2- O 

This suggests a probabilistic algorithm for computing a base and a strong gen- 
erating set. Given a partial base and partial strong generating set for G, compute 
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Schreier trees and chose random elements uniformly from G. Use [Algorithm lOl on 
each element, and if it returns a non-trivial residue, add it to the partial SGS, and 
possibly augment the base. 

If the base and SGS are complete, then of course the residue will be trivial. On 
the other hand, if the base and SGS are not complete, then by Corollary 14.21 the 
probability that k consecutive random elements have trivial residues is less than 
2 . We can thus choose a large enough k for our purposes and assume that the 
base and SGS are complete when we have stripped k consecutive random elements 
to the identity. This is formalised in [Algorithm 4.l| 



Algorithm 4.1: RandomSchreierSims 

Data: A group G acting on a finite set X, a partial base B = (ai, . . . , 

and corresponding partial strong generating set S for G, an integer 
m > 1 and Schreier trees T^ 1 for for i = 1, . . . , k. 

Result: Possibly extended partial base B = (ax, ■ ■ ■ , ot n ) and corresponding 
partial strong generating S set for G such that to consecutive 
random elements have been stripped to identity with respect to B 
and S. 

/^Assumes the existence of a function NewBasePoint(g) that returns a 
point p E X such that p 9 ^ p. */ 

/^Assumes the existence of a function Random(G) that returns a uniformly 
random element from the group G. */ 
l begin 



2 sifts := 

3 while sifts < to do 

4 element := Random(G) 

5 (residue, dropout) := Membership^ 1 , . . . , T k , element) 

6 if residue ^ 1 then 

7 S :— S U {element, element -1 } 

8 if dropout = k + 1 then 

9 point := NewBasePoint(element) 
10 B := B U {point} 

n k := k + 1 

12 end 

13 for i := 1 to k do 

14 T l := ComputeSchreierTree(iS' z , Qi) 

15 end 

16 sifts := 

17 else 

is sifts := sifts + 1 

19 end 

20 end 

21 return (B, S) 



22 end 



1. RANDOM GROUP ELEMENTS 
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Indeed, [Algorithm 4.1| is simpler than |Algorithm 3.10| and implementations are 
usually faster. However, it is not clear how the algorithm Random should be con- 
structed. First of all, we need random bits, and the generation of pseudo-random 
bits is an old problem in computer science, see |Pap94| and |Knu97j . which we will 
not dwell into. Instead, we will assume that uniformly random bits are available. 

1. Random group elements 

The generation of random group elements is also difficult in general, and an 
introduction to the topic can be found in Bab97 . If one knows a base and strong 
generating set one can easily generate random elements by using the factorisation 
described in section select random elements from each transversal and multiply 
them. In our case we do not yet have a base and an SGS, since that is what we are 
trying to compute, and then the only known algorithm that can provably generate 
(almost) uniformly random group elements is described in Bab91 and runs in 
time O(d 10 (\ogq) 5 ) for subgroups of GL(d,q), and this i s too slow f or our purposes. 

The practical algorithms are instead heuristics, see |ACG+99j . for which there 
are no proven guarantees for any uniformly randomness, but instead experiment- 
ation and statistical analysis has shown them to be good. It appears that the 
most successful algorithm is the product replacement algorithm, also called " Shake" , 
which is originally an idea by Charles Leedham-Green and Leonard Soicher, and is 
described in CL GM+95] , 

We will not dwell deeper into the analysis of this algorithm, but the algorithm 
itself is quite simple. Given a group G = (g\, ... , g n ), the "Shake" algorithm main- 
tains a global variable S — (<xi, . . . , a m ) G G m for some m > n, and then each call 
to [Algorithm 4. 2| returns a proposed random group element. 



Algorithm 4.2: Shake 





Data: The global state S — (oi, . . . , a m ) € G m 




Result: An element in G that hopefully is a good approximation of a 




uniformly random element from G. 




/*Assumes the existence of a function Randomlnteger(fc) that returns a 




uniformly random integer from the set {0, . . . , k} * / 


1 


begin 


2 


i := Randomlnteger(m) 


3 


repeat 


4 


j := Randomlnteger(m) 


5 


until i 7^ j 


6 


if RandomInteger(l) = then 


7 


b := OiCLj 


8 


else 


9 


b := ajCLi 


10 


end 


11 


di := b 


12 


return b 


13 


end 
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Evidently, the algorithm is very cheap in terms of time and space. The questions 
that remain are how the state S should be initialised and how m should be chosen, 
and the ability of the algorithm to generate uniformly random group elements are 
highly dependent on the answers. In ,CLGM + 95] the suggestion is that m = 
max(10, In + 1) and that S is initialised to contain the generators g±, . . . , g n , the 
rest being identity elements. Moreover, the state should be initiliased by calling 
|Algorithm 4.2| a number K of times and discarding the results, and the suggestion 
is that K > 60. 

In NP01 this algorithm is also described, as well as a variant of it, called 
"Rattle" , which is due to Leedham-Green. The algorithms are compared and "Rattle' 
is found to be slightly better, but its running time is a bit higher. In GAP, there is 
an implementation of the "Shake" algorithm, which is used in this project. An im- 
plementation of "Rattle" was also made, but it turned out that it was not efficient 
enough. 



CHAPTER 5 



The Schreier-Todd-Coxeter-Sims algorithm 



Coset enumeration is one of the oldest algorithms in group theory, and it was 
first described in |TC36| . G iven a finitely presented group G and H < G, the aim 
of coset enumeration is to construct the list of the cosets of H in G, ie to find a set 
X with a point p G X such that G acts transitively on X and such that H = G p . 
The set X is usually taken to be {1, . . . , n} for some n and p — 1. If the index of 
H is not finite, then the algorithm will usually not terminate. 

The coset enumeration is a trial-and-error process, and quite a few strategies 
have been developed over the years. We will not be concerned about the actual 
algorithm, since it is beyond the scope of this report, and there is a good imple- 
mentation in GAP, which is used in this project. A more detailed description of 
coset enumeration can be found in Sim03 . 

However, it is important to be aware of the overall structure of coset enumer- 
ation. It works by letting the elements of G act on X and introducing new cosets 
when necessary. Then it may happen that some cosets turn out to be the same and 
thus are identified in the list. The structure of the algorithm makes it possible to 
exit prematurely when a given number of cosets have been defined, and this can be 
used if one knows an upper bound on the number of cosets. 

In our case, we want to use the last case of Theorem I3.fi! At level i of Schreier- 
Sims algorithm, we know that the base and SGS are complete for higher levels, 
and want to verify that H l+1 = H l a . +i , which by the theorem is equivalent to 



[H l : H l+1 \ — otf +1 and since we can compute a Schreier tree for a^ +1 we know 

the orbit size. Thus, if we can compute [H l : H l+1 ] using coset enumeration and 
it turns out to be equal the orbit size, it is unnecessary to compute any Schreier 
generators and check for membership. 

All this is formalised in |Algorithm 5.1| where we use the option to exit the coset 
enumeration after M a^ +1 cosets have been defined, for some rational number 
M > 1, since this certainly is an upper bound. In our implementation, the number 
M can be specified by the user, and the default is M = 6/5. This value comes 
from |Leo80| . where the algorithm was introduced and some experimentation was 
carried out to determine a good value of M. 

The Schreier- Todd-Coxeter-Sims algorithm is known to perform particularly 
good when the input partial base and SGS are actually already complete. Thus, it 
can be used for verification of the output from a probabilistic algorithm, and in this 
project it is mainly used in that way, to verify the output from the probabilistic 
algorithm described earlier. 



23 



24 



5. THE SCHREIER-TODD-COXETER-SIMS ALGORITHM 



Algorithm 5.1: SchreierToddCoxeterSims 

Data: A group G acting on a finite set A, a partial base B — (ai, . . . , at) 
and corresponding partial strong generating set S for G, an integer 
1 < i < k such that H^T 1 = H 3 for j = i + 1, . . . , k, and Schreier 

trees T 3 ^ 1 for 1 for j = i + 1, . . . , k. 
Result: Possibly extended partial base B = (a\, . . . , a m ) and corresponding 
partial strong generating S set for G such that H 3 ^ 1 = H 3 for 

j = i, . . . ,m and Schreier trees T 3 " 1 for a^ 3 1 for j = i, . . . , m. 
/*Assumes the existence of a function NewBasePoint(g) that returns a 

point p E X such that p 9 ^ p. */ 
/*Assumes the existence of a function ToddCoxeter(J7i, U2, k) that 
performs coset enumeration on G = (U\) and H = (U2) < G, exiting 
when k cosets have been defined. */ 
1 begin 



2 gens := S 1 

3 y*- 1 ■— ComputeSchreierTree(gens, on) 

4 foreach p G af do 

5 foreach s G gens do 

e table := ToddCoxeter(S'% S l+1 , \T l \ + 1) 

7 if \table\ = \T l \ then 

8 return (B, S) 

9 end 

10 gen := GetSchreierGenerator(T 4_1 ,p, s) 

11 if gen 7^ 1 then 

12 (residue, dropout) := Member ship(T l , . . . , T k , gen) 

13 if residue ^ 1 then 

14 S := SU {gen^en^ 1 } 

15 if dropout = k + 1 then 

16 point := NewBasePoint(gen) 

17 B := B U {point} 
is end 

19 for j := r to i + 1 do 

20 (B,S) :— SchreierToddCoxeterSims(i?, £, j) 

21 end 

22 end 

23 end 

24 end 

25 end 

26 return {B, S) 



27 end 



CHAPTER 6 



Implementation and optimisation 



We will now consider some more practical issues regarding the actual imple- 
mentation and optimisation of the code in the project. 

In GAP, a very fast implementation of the Schreier-Sims algorithm already ex- 
ist s. It is descri bed in |Ser03j and is a heuristic based on the algorithm described 



in BCFAS91 . As with many other algorithms in GAP, it works only with per- 
mutation groups, so for a general group G it first calculates a faithful permutation 
representation, ie an injective homomorphism A : G — + Sym(A), for some finite set 
X, which exists due to the well-known theorem by Cay ley. It then work with the 
image A(G), or rather with H < SW| such that H = A(G). 

Maybe the most important motivation for the GAP package developed in this 
project is the idea that if G is a matrix group, and one work with the matrices 
directly instead of first converting to permutations, then perhaps one could use 
the additional structure, that is otherwise thrown away, and come up with a faster 
algorithm. The code is therefore created with the basic assumption that all group 
elements are invertible matrices over a finite field, and it uses the internal GAP rep- 
resentation of such matrices. 



1. Schreier tree creation 

An important issue is how the Schreier trees are managed. Actually, what we 
really want is the transversals defined by the Schreier trees, so a possibility is to 
store not a tree, but just a list of all points and for each point the element that 
moves the root to this point. This can be realised as a tree with height 1, where the 
edge labels are not the generators, as in the case of a Schreier tree, but the group 
elements, so in general this will take up more memory than the Schreier tree. On 
the other hand, to find a coset representative in a Schreier tree we need to follow a 
path from a point to the root, which takes logarithmic time on average, and may 
take linear time if the tree is not properly balanced, but in the case where we store 
the coset representatives as edge labels, then of course it takes constant time to 
find them later. 

To conclude, we have a trade-off between time and space, and in this project, 
both of the above strategies are available to the user of the package. The concrete 
representation of the trees are as hash tables, where the keys are the points and the 
values are the edge labels of the unique edges directed towards the root, ie we use 
" back-pointers" . This makes it possible to perform in constant time the common 
task of checking if a given point is in the orbit defined by a Schreier tree, since this 
is just a hash tabic lookup. 

1.1. Extending vs creating. If we consider |Algorithm 3.10| in more detail, 
then we notice that the sets S % are only augmented if they are changed. Therefore, 



25 



26 



6. IMPLEMENTATION AND OPTIMISATION 



at line 13. we could save the tree that is computed and, at the next time we arrive 
there, only extend the tree using the new generators, if there are any. 

This may or may not give a faster algorithm. Evidently, there will be less 
work in [Algorithm 3. 1| to extend a Schreier tree than to create a new one, but 
it may happen that the tree becomes more balanced if it is recomputed using all 
generators than if it is extended. If the tree is not balanced, then [Algorithm 3.2| 
will take more time, and this is in fact the function where most time is spent. In 
|Ser03[ it is claimed that this problem with unbalanced trees are more common for 
matrix groups, and empirical studies in this project has shown that in most cases, 
it is better to recompute the Schreier trees every time. 

1.2. Shallow trees. There are algorithms for the creation of Schreier trees 
that are guaranteed to make the Schreier trees shallow, ie balanced, so that the 
they have worst-case logarithmic height. This is crucial if one wants to have good 
worst-case complexity, and two algorithms are described in |Ser03j . one determ- 
inistic and one probabilistic. In this proj ect the determ inistic algorithm have been 
implemented, which is also described in |BCFAS91] , It is too complicated to be 
included here in more detail, but the essential idea is to choose a different set of 
edge labels, rather than the given generators. 



2. Orbit sizes 

As have been noted earlier, using the Schreier-Sims algorithm for a matrix 
group G < GL(g?, q) is in a sense doomed from the beginning, since the complexity 
is exponential in d when G acts on F^. One manifestation of doom in this case is 
that the orbits may become huge, something that does not happen for permutation 
groups. This will make our Schreier trees huge and a large amount of Schreier 
generators must be created, so the algorithm will be slow. 

2.1. Alternating actions. To avoid large orbits, one can use another action 
of G. However, only if the action is faithful is it a permutation representation of G, 
and this is needed if the Schreier-Sims algoritm is going to work, otherwise we will 
calculate a base and strong generating set for the quotient of G with the kernel of 
the action. 

In |But76| . a clever trick was introduced where base points are chosen altern- 
atingly as one-dimensional subspaces (lines) and vectors from those lines. When a 
vector v = (v\, . . . , Vd) € F^ is chosen as base point, it is preceded in the base by 
the line (v) . The action of G on the lines is known as the projective action and of 
course it is not faithful, but since the next point v is from the kernel this does not 
matter. Also, the subspace (v) has a canonical representative (1, V2V^ X , ■ ■ ■ , vav^ ), 
which is trivial to find from v. Therefore, in terms of time, the projective action is 
not particularly more expensive than the action on points. 

Now, if k = \v G \ then (v) G |w G <"> | = k and G^ v ) is the stabiliser of the line 

containing v. This implies that if u S v ^ then u is also on that line, so u = mv 
where m G ¥ q . We see that M — jm S ¥ q \ mv G v < v > } is a subgroup of F* and 
thus I = |u G <"> | = \M\ divides q - 1. 

Instead of one orbit of size k we therefore have two orbits of size k/l and I, 
and since k > k/l + I whenever I > 2 we will for example need to compute and 
sift fewer Schreier generators. On the other hand, our base will probably be longer 
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when using this trick, and empirical studies in this project has shown that it is not 
always a good idea, but it nevertheless implemented and can be used. 

2.2. Eigenspaces. Another method for producing smaller orbits is described 
in M095 . The idea is to choose the base points to be eigenvectors of the given 
generator matrices. 

Recall from linear algebra that the characteristic polynomial of a matrix A G 
GL(<i, q) is ca(x) = det(x/ — A) where I is the identity matrix, and an eigenvalue 
of A is a root of ca(x). Normally, and eigenvector is defined as a vector v G such 
that v A — \v where A is an eigenvalue of A, or equivalently v 3 ^ A ' = where g(x) is 
a linear factor of ca (x) . Here we use a more general version of the latter definition, 
where we allow factors of higher degree as well as linear factors of ca {x) . 

We see that for a linear factor g{x) = x — A and a corresponding eigenvector v, 
the size of the orbit is a divisor of q — 1, since if u G then u = v aA = a\v, 
and as before the possible values of a form a subgroup of F* . More generally, for 
a factor of ca(x) of degree m, the size of the orbit for an eigenvector v is 
bounded above by q rn — 1. To get the smallest possible orbits, we should therefore 
choose eigenvectors corresponding to factors of as small degree as possible, and 
linear factors are easy to find since they correspond to eigenvalues, which are easy 
to compute. 

Note that we have only considered the orbits of groups generated by single 
matrices, and the orbit of a matrix group G = (S) need not be small just because 
we choose as base point an eigenvector of one of the generators in S, but if we 
choose a base point that is an eigenvector of several generators, then it turns out 
that the orbit size are more often small. In M095 these issues are investigated 
and experimented with in some detail, and a heuristic for finding base points that 
will hopefully give small orbits is developed. This has also been implemented in 
GAP, and in this project it is possible to use that algorithm. However, it is not 
always a good idea to use it, since there is some overhead, and it is not certain that 
the orbit sizes will actually be smaller. 

3. Further developments 

The package contains implementations of two more advanced algorithms, the 
so-called Verify routine by Sims, which is an algorithm for verifying if a proposed 
base and SGS are complete, and the nearly linear time algorithm for finding a base 
and an SGS. The complete descriptions of these algorithms are beyond the scope 
of this report, but for completeness we include brief accounts on them. 

3.1. The Verify routine. This algorithm is due to Sims, and has never been 
published, but it is described in Ser03 . Given a group G = (S) acting on the finite 
set X, a point a G X and a subgroup H = (S') < G a , it checks whether H = G a . 
If this is not the case, the algorithm computes g G G a \ H to witness that. To check 
a whole proposed base and SGS, one then uses the third case of Theorem 13.61 and 
checks each level. 

The algorithm is quite involved, both theoretically and when it comes to im- 
plementing it. It has a recursive nature, inducting on S \ S", and it involves things 
like changing base points and computing block systems. 
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3.2. The nearly linear time algorithm. For permutation groups G acting 
on X, there is an algorithm for computing a base and an SGS that runs in nearly 
linear time, ie linear in \X\ except for some logarithmic factor of |X| an d |G|. It is the 
best known algorithm in terms of time complexity, and is described in BCFA S91 
as well as in |Ser03j and it is probabilistic. The complexity is achieved by using 
shallow Schrcier trees, a fast probabilistic verification algorithm, and by sifting 
only a few randomly selected Schreier generators. Otherwise, the algorithm is quite 
similar to the algorithms we have described earlier for computing a base and an 
SGS. 

The random Schreier generators are computed using random subproducts which 
are described in |Ser03| as well as by selecting random group elements, and the 
algorithm relies on some theorems relating the number of Schreier generators to 
compute to the given probability of correctness. 



CHAPTER 7 



Performance 

As observed earlier, one of the objectives of the project was to make an im- 
plementation that hopefully would be faster than the one already existing in GAP. 
To determine if this objective was met, the implementation has been benchmarkcd 
and compared with the built-in GAP implementation. 

The algorithm has been used to compute a base and SGS for some matrix 
groups that are easy to construct in GAP, and the generating sets that were used 
were the standard generating sets from the GAP library. The main test groups were 
classical groups: the general and special linear groups GL(d, q) and SL(d, q) and the 
general and special orthogonal groups GO(d, q) and SO(d, q), for various (small) d 
and q. The algorithm was also tested on some Suzuki groups Sz(q) (where q is a 
non-square power of 2) and some Ree groups Ree(q) (where q = 3 1+2m for some 
m > 0). 

Another, perhaps more realistic, test of the performance was done by running 
the algorithm on randomly formed sets of invertible matrices of a given dimension 
over a given finite field. 

The benchmarks was carried out on two quite standard PC computers, the 
first test on a computer with an AMD Athlon CPU running at 2 GHz and with 1 
GB of physical RAM, and the second test on a computer with an Intel Pentium 
4 CPU running at 2.8 GHz and also with 1 GB of physical RAM. It is likely that 
these were the only important parameters, since GAP is mainly CPU intensive, and 
swapping was avoided duringd the benchmark, so the hard disk speed should be a 
negligible factor. The GAP installation tests reported GAP4stones values of 194624 
and 253581, respectively. 

The details of the benchmarks is shown in the appendix. During the first test, 
the existing algorithm was the faster one for most groups, but for some groups our 
implementation was faster, most notably for the Suzuki groups. It should be noted 
that all running times for our implementation comes from using the same algorithm 
options, and it is possible to get better times, especially for the smaller groups, by 
elaborating with other options. The second test also indicated that the existing 
implementation is faster in most cases. 

In terms of memory, no rigorous benchmark has been performed, but during 
the above benchmark some simple checks of the amount of memory allocated by 
the GAP process were performed. It seemed that our implementation consumed 
less memory overall, and in some cases the difference was large. Indeed, since we 
wanted to avoid swapping during the benchmark, no larger Suzuki group than 
Sz(32) could be checked during the first test, since the existing implementation ran 
out of memory. Our implementation was far from having such problems, consuming 
no more than about 50 MB for the Suzuki groups. 

The conclusion must therefore be that the project is a small success. 
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Benchmark 



Here are details of the results of the first benchmark. The first column shows the 
test groups, as they are written in GAP, and the other columns show the execution 
time in milliseconds. The method used to compare the algorithms was to compute 
the order of the input group using the orbit sizes in the computed base and SGS, 
so to measure the existing implementation we used a command like 
gap> Size (Group (GeneratorsOf Group (GL (4, 4)))); 
gap> benchmark_time := time; 



Table 1: Benchmark results 



Group 



Project 



GAP 



SL(2,2) 



110 

20 
20 
80 
60 
40 
40 
80 
50 
50 
50 
20 
30 
70 
40 
50 
50 
50 
40 
50 
40 
170 



170 

10 



20 

10 



10 

30 



10 





10 

10 

10 

10 



10 



10 



20 



GO(+l,2,2) 
GO(-l,2,2) 
GL(2,4) 



SL(2,4) 



GO(+l,2,4) 
GO(-l,2,4) 
GL(2,3) 



SL(2,3) 



GO(+l,2,3) 
GO(-l,2,3) 
SO(+l,2,3) 
SO(-l,2,3) 
GL(2, 5) 



SL(2,5) 



GO(+l,2,5) 
GO(-l,2,5) 
SO(+l,2,5) 
SO(-l,2,5) 
SL(3,2) 



GO(0,3,2) 
GL(3,4) 
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Table 2: Benchmark results Table 3: Benchmark results 



Group 


Project 


GAP 


Group 


Project 


GAP 


SL(3,4) 


110 


20 


GL(5,3) 


890 


30 


GO(0,3,4) 


90 


10 


SL(5,3) 


5090 


130 


GL(3,3) 


70 


10 


GO(0,5,3) 


100 


40 


SL(3,3) 


100 





SO(0,5,3) 


150 


40 


GO(0,3,3) 


60 


10 


GL(5,5) 


24500 


410 


SO(0,3,3) 


50 





SL(5,5) 


12380 


1450 


GL(3,5) 


190 


60 


GO(0,5,5) 


470 


370 


SL(3,5) 


110 


40 


SO(0,5,5) 


390 


380 


GO(0,3,5) 


70 


10 


SL(6,2) 


1250 


40 


SO(0,3,5) 

\ J 7 / 


70 


20 


GO(+l,6,2) 

\ 1 7 7/ 


250 


20 


SL(4,2) 


130 


10 


GO(-l,6,2) 


350 


10 


GO(+l,4,2) 


40 


10 


GL(6,4) 


205430 


460 


GO(-l,4,2) 


60 





SL(6,4) 


89580 


4030 


GL(4,4) 


1480 


30 


GO(+l,6,4) 


3880 580 




SL(4,4) 


690 


80 


GO(-l,6,4) 

\ 7 7/ 


3700 


550 


GO(+l,4,4) 


90 


30 


GL(6,3) 


3850 


100 


GO(-l,4,4) 


170 


20 


SL(6,3) 


16890 


660 


GL(4,3) 


210 


30 


GO(+l,6,3) 


340 


90 


SL(4,3) 


410 


20 


GO(-l,6,3) 


340 


200 


GO(+l,4,3) 


80 


20 


SO(+l,6,3) 


520 


100 


GO(-l,4,3) 


80 


10 


SO(-l,6,3) 


480 


100 


SO(+l,4,3) 


80 





GL(6,5) 


292220 


84030 


SO(-l,4,3) 


80 


10 


SL(6,5) 


227830 


82010 


GL(4,5) 


1600 


50 


GO(+l,6,5) 


5410 


2050 


SL(4,5) 


790 


190 


GO(-l,6,5) 


1710 


1440 


GO(+l,4,5) 


100 


60 


SO(+l,6,5) 


3190 


1460 


GO(-l,4,5) 


140 


50 


SO(-l,6,5) 


3520 


1390 


SO(+l,4,5) 


140 


60 


Sz(8) 


180 


1170 


SO(-l,4,5) 


100 


60 


Sz(32) 


4390 


133860 


SL(5,2) 


560 


10 


Sz(128) 


1084510 


> 3600000 


GO(0,5,2) 


90 


10 


Ree(27) 


816590 


687780 


GL(5,4) 


26440 


220 








SL(5,4) 


4650 


440 








GO(0,5,4) 


230 


100 
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Here are the results of the second benchmark. The first column shows the finite 
field size, the second column the matrix dimension and the third column is the size 
of the random generating set that was formed. Each generating set was computed 
using the given number of calls to RandomlnvertibleMat. For each line in the 
table, 20 generating sets with the given parameters were formed, the algorithm was 
executed on these sets, and the average time over these 20 executions is shown in 
the table in the last two columns. 



Table 4: Benchmark results 



Field 


Dim 


Set 


Project 


GAP 


2 


2 


1 


25 


16 


2 


2 


2 


23 


4 


2 


2 


3 


23 


2 


2 


2 


4 


23 


2 


2 


2 


5 


23 


2 


2 


2 


6 


23 


2 


2 


2 


7 


23 


3 


2 


2 


8 


23 


2 


2 


2 


9 


24 


3 


2 


2 


10 


23 


3 


2 


3 


1 


34 


2 


2 


3 


2 


42 


3 


2 


3 


3 


49 


3 


2 


3 


4 


52 


3 


2 


3 


5 


53 


4 


2 


3 


6 


58 


4 


2 


3 


7 


60 


4 


2 


3 


8 


68 


4 


2 


3 


9 


62 


5 


2 


3 


10 


68 


4 


2 


4 


1 


46 


3 


2 


4 


2 


87 


4 


2 


4 


3 


117 


5 


2 


4 


4 


120 


6 


2 


4 


5 


126 


6 


2 


4 


6 


124 


7 


2 


4 


7 


145 


7 


2 


4 


8 


151 


7 


2 


4 


9 


161 


8 


2 


4 


10 


168 


8 
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Table 5: Benchmark results Table 6: Benchmark results 



Field 


Dim 


Set 


Project 


GAP 


Field 


Dim 


Set 


Project 


GAP 


2 


5 


1 


58 


3 


4 


4 


1 


75 


15 


2 


5 


2 


208 


10 


4 


4 


2 


430 


26 


2 


5 


3 


231 


11 


4 


4 


3 


576 


28 


2 


5 


4 


301 


12 


4 


4 


4 


589 


29 


2 


5 


5 


294 


13 


4 


4 


5 


749 


28 


2 


5 


6 


357 


14 


4 


4 


6 


714 


31 


2 


5 


7 


384 


15 


4 


4 


7 


838 


33 


2 


5 


8 


355 


15 


4 


4 


8 


912 


38 


2 


5 


9 


407 


16 


4 


4 


9 


1019 


36 


2 


5 


10 


471 


17 


4 


4 


10 


1085 


38 


2 


6 


1 


75 


4 


4 


5 


1 


109 


52 


2 


6 


2 


612 


24 


4 


5 


2 


3141 


170 


2 


6 


3 


530 


27 


4 


5 


3 


4248 


119 


2 


6 


4 


704 


28 


4 


5 


4 


3660 


106 


2 


6 


5 


1033 


32 


4 


5 


5 


5126 


116 


2 


6 


6 


903 


34 


4 


5 


6 


3895 


124 


2 


6 


7 


1209 


37 


4 


5 


7 


4416 


134 


2 


6 


8 


1055 


41 


4 


5 


8 


5154 


137 


2 


6 


9 


1300 


44 


4 


5 


9 


5002 


147 


2 


6 


10 


1158 


45 


4 


5 


10 


6285 


156 


2 


7 


1 


86 


8 


4 


6 


1 


193 


212 


2 


7 


2 


1229 


24 


4 


6 


2 


27943 


655 


2 


7 


3 


1675 


20 


4 


6 


3 


33664 


455 


2 


7 


4 


2289 


20 


4 


6 


4 


30882 


478 


2 


7 


5 


2183 


25 


4 


6 


5 


40467 


506 


2 


7 


6 


2404 


23 


4 


6 


6 


36702 


534 


2 


7 


7 


2556 


24 


4 


6 


7 


35179 


570 


2 


7 


8 


2477 


25 


4 


6 


8 


41211 


593 


2 


7 


9 


2963 


29 


4 


6 


9 


47733 


630 


2 


7 


10 


3343 


27 


4 


6 


10 


46698 


661 


4 


2 


1 


37 


3 


4 


7 


1 


8786 


19238 


4 


2 


2 


55 


4 


4 


7 


2 


335656 


95870 


4 


2 


3 


60 


5 


4 


7 


3 


337232 


100999 


4 


2 


4 


61 


5 


4 


7 


4 


371067 


98525 


4 


2 


5 


62 


5 


4 


7 


5 


381631 


94616 


4 


2 


6 


64 


6 


4 


7 


6 


463195 


94353 


4 


2 


7 


69 


7 


4 


7 


7 


463824 


102045 


4 


2 


8 


70 


7 


4 


7 


8 


460079 


104262 


4 


2 


9 


73 


7 


4 


7 


9 


485832 


115177 


4 


2 


10 


75 


8 


4 


7 


10 


497838 


99240 


4 


3 


1 


59 


5 


3 


2 


1 


39 


3 


4 


3 


2 


118 


12 


3 


2 


2 


47 


3 


4 


3 


3 


133 


14 


3 


2 


3 


49 


3 


4 


3 


4 


152 


17 


3 


2 


4 


52 


3 


4 


3 


5 


158 


19 


3 


2 


5 


51 


4 


4 


3 


6 


175 


21 


3 


2 


6 


53 


4 


4 


3 


7 


185 


22 


3 


2 


7 


46 


4 


4 


3 


8 


215 


25 


3 


2 


8 


54 


4 


4 


3 


9 


216 


27 


3 


2 


9 


54 


5 


4 


3 


10 


226 


28 


3 


2 


10 


47 


5 
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Table 7: Benchmark results Table 8: Benchmark results 



Field 


Dim 


Set 


Project 


GAP 


Field 


Dim 


Set 


Project 


GAP 


3 


3 


1 


59 


4 


5 


2 


1 


46 


3 


3 


3 


2 


92 


7 


5 


2 


2 


59 


5 


3 


3 


3 


98 


8 


5 


2 


3 


61 


6 


3 


3 


4 


101 


9 


5 


2 


4 


64 


6 


3 


3 


5 


111 


9 


5 


2 


5 


64 


8 


3 


3 


6 


114 


11 


5 


2 


6 


65 


9 


3 


3 


7 


126 


12 


5 


2 


7 


68 


9 


3 


3 


8 


126 


13 


5 


2 


8 


73 


10 


3 


3 


9 


136 


12 


5 


2 


9 


72 


10 


3 


3 


10 


143 


15 


5 


2 


10 


77 


11 


3 


4 


1 


74 


7 


5 


3 


1 


59 


8 


3 


4 


2 


244 


22 


5 


3 


2 


147 


14 


3 


4 


3 


278 


26 


5 


3 


3 


157 


15 


3 


4 


4 


310 


29 


5 


3 


4 


178 


15 


3 


4 


5 


366 


32 


5 


3 


5 


182 


14 


3 


4 


6 


334 


35 


5 


3 


6 


203 


15 


3 


4 


7 


415 


39 


5 


3 


7 


226 


17 


3 


4 


8 


452 


40 


5 


3 


8 


240 


17 


3 


4 


9 


475 


45 


5 


3 


9 


258 


19 


3 


4 


10 


454 


47 


5 


3 


10 


269 


20 


3 


5 


1 


82 


15 


5 


4 


1 


78 


35 


3 


5 


2 


1029 


44 


5 


4 


2 


957 


94 


3 


5 


3 


1110 


28 


5 


4 


3 


1123 


73 


3 


5 


4 


1692 


29 


5 


4 


4 


1038 


65 


3 


5 


5 


1361 


36 


5 


4 


5 


1300 


68 


3 


5 


6 


1612 


34 


5 


4 


6 


1337 


72 


3 


5 


7 


1752 


35 


5 


4 


7 


1563 


78 


3 


5 


8 


1887 


40 


5 


4 


8 


1703 


81 


3 


5 


9 


1928 


39 


5 


4 


9 


1815 


93 


3 


5 


10 


2197 


42 


5 


4 


10 


1864 


91 


3 


6 


1 


111 


44 


5 


5 


1 


166 


158 


3 


6 


2 


6517 


105 


5 


5 


2 


10330 


414 


3 


6 


3 


5772 


131 


5 


5 


3 


11236 


508 


3 


6 


4 


7463 


117 


5 


5 


4 
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